(54) Method and system for inserting a spread spectrum watermark into multimedia data



(57) Digital watermarking of audio, image, video or
multimedia data is achieved by inserting the watermark
into the perceptually significant components of a
decomposition of the data in a manner so as to be
visually imperceptible. In a preferred method, a frequency
spectral image of the data, preferably a Fourier
transform of the data, is obtained. A watermark is inserted
into perceptually significant components of the
frequency spectral image. The resultant watermarked
spectral image is subjected to an inverse transform to
produce watermarked data. The watermark is extracted
from watermarked data by first comparing the
watermarked data with the original data to obtain an extracted
watermark. Then, the original watermark, original data
and the extracted watermark are compared to generate
a watermark which is analyzed for authenticity of the
watermark.
Description

The present invention concerns a method of digital
watermarking for use in audio, image, video and
multimedia data for the purpose of authenticating copyright
ownership, identifying copyright infringers or
transmitting a hidden message. Specifically, a watermark is
inserted into the perceptually most significant
components of a decomposition of the data in a manner so as
to be virtually imperceptible. More specifically, a narrow
band signal representing the watermark is placed in a
wideband channel that is the data.

The proliferation of digitized media such as audio,
image and video is creating a need for a security system
which facilitates the identification of the source of the
material. The need manifests itself in terms of copyright
enforcement and identification of the source of the
material.

Using conventional cryptographic systems permits
only valid keyholder access to encrypted data, but once
the data is decrypted, it is not possible to maintain
records of its subsequent representation or
transmission. Conventional cryptography therefore provides
minimal protection against data piracy of the type a
publisher or owner of data or material is confronted with by
unauthorized reproduction or distribution of such data or
material.

A digital watermark is intended to complement
cryptographic processes. The watermark is a visible or
preferably an invisible identification code that is
permanently embedded in the data. That is, the watermark
remains with the data after any decryption process. As
used herein the terms data and material will be
understood to refer to audio (speech and music), images
(photographs and graphics), video (movies or
sequences of images) and multimedia data
(combinations of the above categories of materials) or processed
or compressed versions thereof. These terms are not
intended to refer to ASCII representations of text, but do
refer to text represented as an image. A simple example
of a watermark is a visible "seal" placed over an image
to identify the copyright owner. However, the watermark
might also contain additional information, including the
identity of the purchaser of the particular copy of the
image. An effective watermark should possess the
following properties:

1. The watermark should be perceptually invisible
or its presence should not interfere with the material
being protected.

2. The watermark must be difficult (preferably
virtually impossible) to remove from the material without
rendering the material useless for its intended
purpose. However, if only partial knowledge is known,
e.g. the exact location of the watermark within an
image is unknown, then attempts to remove or
destroy the watermark, for instance by adding
noise, should result in severe degradation in data
fidelity, rendering the data useless, before the
watermark is removed or lost.

3. The watermark should be robust against
collusion by multiple individuals who each possess a
watermarked copy of the data. That is, the
watermark should be robust to the combining of copies of
the same data set to destroy the watermarks. Also,
it must not be possible for colluders to combine
each of their images to generate a different valid
watermark.

4. The watermark should still be retrievable if
common signal processing operations are applied to the
data. These operations include, but are not limited
to, digital-to-analog and analog-to-digital
conversion, resampling, requantization (including
dithering and recompression) and common signal
enhancements to image contrast and color, or
audio bass and treble for example. The watermarks
in image and video data should be immune from
geometric image operations such as rotation,
translation, cropping and scaling.

5. The same digital watermark method or algorithm
should be applicable to each of the different media
under consideration. This is particularly useful in
watermarking of multimedia material. Moreover,
this feature is conducive to the implementation of
video and image/video watermarking using
common hardware.

6. Retrieval of the watermark should
unambiguously identify the owner. Moreover, the accuracy of
the owner identification should degrade gracefully
during attack.

Several previous digital watermarking methods
have been proposed. L. F. Turner in patent number
WO89/08915 entitled "Digital Data Security System"
proposed a method for inserting an identification string
into a digital audio signal by substituting the
"insignificant" bits of randomly selected audio samples with the
bits of an identification code. Bits are deemed
"insignificant" if their alteration is inaudible. Such a system is
also appropriate for two dimensional data such as
images, as discussed in an article by R.G. Van
Schyndel et al entitled "A digital watermark" in Intl. Conf. on
Image Processing, vol 2, pages 86-90, 1994. The
Turner method may easily be circumvented. For
example, if it is known that the algorithm only affects the least
significant two bits of a word, then it is possible to
randomly flip all such bits, thereby destroying any existing
identification code.
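
The circumvention described above can be illustrated with a
minimal Python sketch. It is not Turner's implementation: the
8-bit sample values, the 64-bit code, the use of a single least
significant bit (rather than two) and the function names
embed_lsb, extract_lsb and randomize_lsb are all assumptions
chosen for illustration.

```python
import random

def embed_lsb(samples, code_bits, positions):
    """Overwrite the least significant bit of the selected samples with code bits."""
    marked = list(samples)
    for pos, bit in zip(positions, code_bits):
        marked[pos] = (marked[pos] & ~1) | bit
    return marked

def extract_lsb(samples, positions):
    """Read the code back from the least significant bits at the known positions."""
    return [samples[pos] & 1 for pos in positions]

def randomize_lsb(samples, rng):
    """Attack: replace every LSB with a random bit, without knowing the positions."""
    return [(s & ~1) | rng.getrandbits(1) for s in samples]

rng = random.Random(0)
samples = [rng.randrange(256) for _ in range(1000)]         # hypothetical 8-bit samples
code_bits = [rng.getrandbits(1) for _ in range(64)]         # hypothetical identification code
positions = rng.sample(range(len(samples)), len(code_bits)) # randomly selected samples

marked = embed_lsb(samples, code_bits, positions)
attacked = randomize_lsb(marked, rng)

recovered_clean = extract_lsb(marked, positions)
recovered_attacked = extract_lsb(attacked, positions)
max_change = max(abs(a - b) for a, b in zip(marked, attacked))
```

Each sample changes by at most one unit under the attack, so the
attack is as inaudible as the embedding itself, yet the code read
back from the attacked samples no longer matches the embedded one.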

An article entitled "Assuring Ownership Rights for
Digital Images" by G. Caronni, in Proc. Reliable IT
Systems, VIS '95, 1995 suggests adding tags (small
geometric patterns) to digitized images at brightness levels
that are imperceptible. While the idea of hiding a spatial
watermark in an image is fundamentally sound, this
scheme is susceptible to attack by filtering and
redigitization. The fainter such watermarks are, the more
susceptible they are to such attacks and geometric shapes
provide only a limited alphabet with which to encode
information. Moreover, the scheme is not applicable to
audio data and may not be robust to common geometric
distortions, especially cropping.

J. Brassil et al in an article entitled "Electronic
Marking and Identification Techniques to Discourage
Document Copying" in Proc. of Infocom 94, pp 1278-1287,
1994 propose three methods appropriate for document
images in which text is common. Digital watermarks are
coded by: (1) vertically shifting text lines, (2) horizontally
shifting words, or (3) altering text features such as the
vertical endlines of individual characters. Unfortunately,
all three proposals are easily defeated, as discussed by
the authors. Moreover, these techniques are restricted
exclusively to images containing text.

An article by K. Tanaka et al entitled "Embedding
Secret Information into a Dithered Multi-level Image" in
IEEE Military Comm. Conf., pp216-220, 1990 and K.
Mitsui et al in an article entitled "Video-Steganography"
in IMA Intellectual Property Proc., v1, pp187-206, 1994,
describe several watermarking schemes that rely on
embedding watermarks that resemble quantization
noise. Their ideas hinge on the notion that quantization
noise is typically imperceptible to viewers. Their first
scheme injects a watermark into an image by using a
predetermined data stream to guide level selection in a
predictive quantizer. The data stream is chosen so that
the resulting watermark looks like quantization noise. A
variation of this scheme is also presented, where a
watermark in the form of a dithering matrix is used to
dither an image in a certain way. There are several
drawbacks to these schemes. The most important is
that they are susceptible to signal processing,
especially requantization, and geometric attacks such as
cropping. Furthermore, they degrade an image in the
same way that predictive coding and dithering can.

In Tanaka et al, the authors also propose a scheme
for watermarking facsimile data. This scheme shortens
or lengthens certain runs of data in the run length code
used to generate the coded fax image. This proposal is
susceptible to digital-to-analog and analog-to-digital
conversions. In particular, randomizing the least
significant bit (LSB) of each pixel's intensity will completely
alter the resulting run length encoding. Tanaka et al also
propose a watermarking method for "color-scaled
picture and video sequences". This method applies the
same signal transform as JPEG (DCT of 8 x 8
sub-blocks of an image) and embeds a watermark in the
coefficient quantization module. While being compatible
with existing transform coders, this scheme is quite
susceptible to requantization and filtering and is equivalent
to coding the watermark in the least significant bits of
the transform coefficients.

In a recent paper by Macq and Quisquater entitled
"Cryptology for Digital TV Broadcasting" in Proc. of the
IEEE, 83(6), pp944-957, 1995 there is briefly discussed
the issue of watermarking digital images as part of a
general survey on cryptography and digital television.
The authors provide a description of a procedure to
insert a watermark into the least significant bits of pixels
located in the vicinity of image contours. Since it relies
on modifications of the least significant bits, the
watermark is easily destroyed. Further, the method is only
applicable to images in that it seeks to insert the
watermark into image regions that lie on the edge of contours.

W. Bender et al in an article entitled "Techniques for
Data Hiding" in Proc. of SPIE, v2420, page 40, July
1995, describe two watermarking schemes. The first is
a statistical method called "Patchwork". Patchwork
randomly chooses n pairs of image points (a_i, b_i) and
increases the brightness at a_i by one unit while
correspondingly decreasing the brightness of b_i. The
expected value of the sum of the differences of the n
pairs of points is claimed to be 2n, provided certain
statistical properties of the image are true. In particular, it is
assumed that all brightness levels are equally likely, that
is, intensities are uniformly distributed. However, in
practice, this is very uncommon. Moreover, the scheme
may not be robust to randomly jittering the intensity
levels by a single unit, and may be extremely sensitive to
geometric affine transformations.
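
The Patchwork statistic described above can be sketched in a few
lines of Python. This is an illustrative reading of the
description, not Bender et al's code: the synthetic uniformly
distributed "pixels", the value of n and the function names are
assumptions. The sketch shows that the embedding shifts the sum of
differences by exactly 2n; the claim that the sum itself is close
to 2n additionally relies on the unmarked sum having an expected
value of zero, which is the statistical assumption criticised in
the text.

```python
import random

def patchwork_embed(pixels, pairs):
    """Raise brightness at a_i by one unit and lower it at b_i by one unit."""
    marked = list(pixels)
    for a, b in pairs:
        marked[a] += 1
        marked[b] -= 1
    return marked

def patchwork_statistic(pixels, pairs):
    """Sum of the brightness differences over the n chosen pairs."""
    return sum(pixels[a] - pixels[b] for a, b in pairs)

rng = random.Random(1)
n = 5000
# Hypothetical image: uniformly distributed intensities, kept away from 0 and 255
pixels = [rng.randrange(1, 255) for _ in range(4 * n)]
# 2n distinct points, split into n pairs (a_i, b_i)
indices = rng.sample(range(len(pixels)), 2 * n)
pairs = list(zip(indices[:n], indices[n:]))

before = patchwork_statistic(pixels, pairs)   # scatters around zero for unmarked data
marked = patchwork_embed(pixels, pairs)
after = patchwork_statistic(marked, pairs)    # shifted upward by exactly 2n
```

Because each unit of brightness contributes to the shift, jittering
intensities by a single unit perturbs the statistic on the same
scale as the embedded signal.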

The second method is called "texture block coding",
where a region of random texture pattern found in the
image is copied to an area of the image with similar
texture. Autocorrelation is then used to recover each
texture region. The most significant problem with this
technique is that it is only appropriate for images that
possess large areas of random texture. The technique
could not be used on images of text for example. Nor is
there a direct analog for audio.

In addition to direct work on watermarking images,
there are several works of interest in related areas. E.H.
Adelson in U.S. Patent No. 4,939,515 entitled "Digital
Signal Encoding and Decoding Apparatus" describes a
technique for embedding digital information in an
analog signal for the purpose of inserting digital data into an
analog TV signal. The analog signal is quantized into
one of two disjoint ranges ({0,2,4...}, {1,3,5}, for
example) which are selected based on the binary digit to be
transmitted. Thus Adelson's method is equivalent to
watermark schemes that encode information into the
least significant bits of the data or its transform
coefficients. Adelson recognizes that the method is
susceptible to noise and therefore proposes an alternative
scheme wherein a 2x1 Hadamard transform of the
digitized analog signal is taken. The differential coefficient
of the Hadamard transform is offset by 0 or 1 unit prior
to computing the inverse transform. This corresponds to
encoding the watermark into the least significant bit of
the differential coefficient of the Hadamard transform. It
is not clear that this approach would demonstrate
enhanced resilience to noise. Furthermore, like all such
least significant bit schemes, an attacker can eliminate
the watermark by randomization.

U.S. Patent No. 5,010,405 describes a method of
interleaving a standard NTSC signal within an
enhanced definition television (EDTV) signal. This is
accomplished by analyzing the frequency spectrum of
the EDTV signal (larger than that of the NTSC signal)
and decomposing it into three sub-bands (L, M, H for low,
medium and high frequency respectively). In contrast,
the NTSC signal is decomposed into two subbands, L
and M. The coefficients, M_k, within the M band are
quantized into M levels and the high frequency
coefficients, H_k, of the EDTV signal are scaled such that the
addition of the H_k signal plus any noise present in the
system is less than the minimum separation between
quantization levels. Once more, the method relies on
modifying least significant bits. Presumably, the
mid-range rather than low frequencies were chosen
because they are less perceptually significant. In
contrast, the method proposed in the present invention
modifies the most perceptually significant components
of the signal.

Finally, it should be noted that many, if not all, of the
prior art protocols are not collusion resistant.

Recently, Digimarc Corporation of Portland,
Oregon, has described work referred to as signature
technology for use in identifying digital intellectual property.
Their method adds or subtracts small random quantities
from each pixel. Addition or subtraction is based on
comparing a binary mask of N bits with the least
significant bit (LSB) of each pixel. If the LSB is equal to the
corresponding mask bit, then the random quantity is
added, otherwise it is subtracted. The watermark is
extracted by first computing the difference between the
original and watermarked images and then by
examining the sign of the difference, pixel by pixel, to determine
if it corresponds to the original sequence of
additions/subtractions. The Digimarc technique is not based
on direct modifications of the image spectrum and does
not make use of perceptual relevance. While the
technique appears to be robust, it may be susceptible to
constant brightness offsets and to attacks based on
exploiting the high degree of local correlation present in
an image. For example, randomly switching the position
of similar pixels within a local neighborhood may
significantly degrade the watermark without damaging the
image.

In a paper by Koch, Rindfrey and Zhao entitled
"Copyright Protection for Multimedia Data", two general
methods for watermarking images are described. The
first method partitions an image into 8x8 blocks of pixels
and computes the Discrete Cosine Transform (DCT) of
each of these blocks. A pseudorandom subset of the
blocks is chosen and in each such block a triple of
frequencies selected from one of 18 predetermined triples
is modified so that their relative strengths encode a 1 or
0 value. The 18 possible triples are composed by
selection of three out of eight predetermined frequencies
within the 8x8 DCT block. The choice of the eight
frequencies to be altered within the DCT block appears to
be based on the belief that middle frequencies have a
moderate variance level, i.e., they have similar
magnitude. This property is needed in order to allow the
relative strength of the frequency triples to be altered
without requiring a modification that would be
perceptually noticeable. Unlike in the present invention, the set of
frequencies is not chosen based on any perceptual
significance or relative energy considerations. In addition,
because the variance between the eight frequency
coefficients is small, one would expect that the
technique may be sensitive to noise or distortions. This is
supported by the experimental results reported in the
Koch et al paper, supra, where it is reported that the
"embedded labels are robust against JPEG
compression for a quality factor as low as about 50%". In
contrast, the method described in accordance with the
teachings of the present invention has been
demonstrated with compression quality factors as low as 5
percent.

An earlier proposal by Koch and Zhao in a paper
entitled "Toward Robust and Hidden Image Copyright
Labeling" proposed not triples of frequencies but pairs
of frequencies and was again designed specifically for
robustness to JPEG compression. Nevertheless, the
report states that "a lower quality factor will increase the
likelihood that the changes necessary to superimpose
the embedded code on the signal will be noticeably
visible".

In a second method, proposed by Koch and Zhao,
designed for black and white images, no frequency
transform is employed. Instead, the selected blocks are
modified so that the relative frequency of white and
black pixels encodes the final value. Both watermarking
procedures are particularly vulnerable to multiple
document attacks. To protect against this, Zhao and Koch
proposed a distributed 8x8 block of pixels created by
randomly sampling 64 pixels from the image. However,
the resulting DCT has no relationship to that of the true
image. Consequently, one would expect such
distributed blocks to be both sensitive to noise and likely to
cause noticeable artifacts in the image.

In summary, prior art digital watermarking
techniques are not robust and the watermark is easy to
remove. In addition, many prior techniques would not
survive common signal and geometric distortions.

The present invention overcomes the limitations of
the prior art methods by providing a watermarking
system that embeds a unique identifier into the
perceptually significant components of a decomposition of an
image, an audio signal or a video sequence.

Preferably, the decomposition is a spectral
frequency decomposition. The watermark is embedded in
the data's perceptually significant frequency
components. This is because an effective watermark cannot
be located in perceptually insignificant regions of image
data or in its frequency spectrum, since many common
signal or geometric processes affect these components.
For example, a watermark located in the high frequency
spectral components of an image is easily removed,
with minor degradation to the image, by a process that
performs low pass filtering. The issue then becomes
one of how to insert the watermark into the most
significant regions of the data frequency spectrum without
the alteration being noticeable to an observer, i.e. a
human or a machine feature recognition system. Any
spectral component may be altered, provided the
alteration is small. However very small alterations are
susceptible to any noise present or intentional distortion.
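
The low pass filtering example above can be made concrete with a
short Python sketch on a one-dimensional signal. Everything in it
is an assumption made for illustration: the synthetic host signal,
the orthonormal DCT used as the frequency decomposition, the
additive embedding with strength alpha, the filter that simply
zeroes the upper half of the spectrum, and the helper names.

```python
import math
import random

def dct(signal):
    """Orthonormal DCT-II (O(n^2); adequate for a short illustrative signal)."""
    n = len(signal)
    return [math.sqrt((1.0 if k == 0 else 2.0) / n)
            * sum(x * math.cos(math.pi * (i + 0.5) * k / n) for i, x in enumerate(signal))
            for k in range(n)]

def idct(coeffs):
    """Inverse of dct() above."""
    n = len(coeffs)
    return [sum(math.sqrt((1.0 if k == 0 else 2.0) / n) * c
                * math.cos(math.pi * (i + 0.5) * k / n) for k, c in enumerate(coeffs))
            for i in range(n)]

def embed(signal, slots, watermark, alpha):
    """Add alpha * w to the chosen frequency coefficients (assumed additive rule)."""
    coeffs = dct(signal)
    for k, w in zip(slots, watermark):
        coeffs[k] += alpha * w
    return idct(coeffs)

def low_pass(signal, keep):
    """Crude low pass filter: zero every coefficient with index >= keep."""
    return idct([c if k < keep else 0.0 for k, c in enumerate(dct(signal))])

rng = random.Random(3)
n = 64
host = [100.0 + 40.0 * math.cos(math.pi * (i + 0.5) * 2 / n) + rng.gauss(0.0, 1.0)
        for i in range(n)]
watermark = [rng.gauss(0.0, 1.0) for _ in range(8)]
alpha = 1.0
low_slots = list(range(1, 9))        # low frequencies, DC term skipped
high_slots = list(range(n - 8, n))   # highest frequencies

filtered_host = low_pass(host, keep=n // 2)
filtered_low = low_pass(embed(host, low_slots, watermark, alpha), keep=n // 2)
filtered_high = low_pass(embed(host, high_slots, watermark, alpha), keep=n // 2)

# High-frequency watermark: the filtered copy is identical to the filtered host.
erased = max(abs(a - b) for a, b in zip(filtered_high, filtered_host))

# Low-frequency watermark: still recoverable by comparison with the original.
host_coeffs, low_coeffs = dct(host), dct(filtered_low)
recovered = [(low_coeffs[k] - host_coeffs[k]) / alpha for k in low_slots]
survived = max(abs(r - w) for r, w in zip(recovered, watermark))
```

In this toy setting the high-frequency watermark leaves no trace
after filtering, while the low-frequency watermark is recovered to
within floating point error.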

In order to overcome this problem, the frequency
domain of the image data or sound data may be
considered as a communication channel, and correspondingly
the watermark may be considered as a signal
transmitted through the channel. Attacks and intentional signal
distortions are thus treated as noise from which the
transmitted signal must be immune. Attacks are
intentional efforts to remove, delete or otherwise overcome
the beneficial aspects of the data watermarking. While
the present invention is intended to embed watermarks
in data, the same methodology can be applied to
sending any type of message through media data.

Instead of encoding the watermark into the least
significant components of the data, the present
invention considers applying concepts of spread spectrum
communication. In spread spectrum communications, a
narrowband signal is transmitted over a much larger
bandwidth such that the signal energy present in any
single frequency is imperceptible. In a similar manner,
the watermark is spread over many frequency bins so
that the energy in any single bin is small and
imperceptible. Since the watermark verification process includes
a priori knowledge of the locations and content of the
watermarks, it is possible to concentrate these many
weak signals into a single signal with a high
signal-to-noise ratio. Destruction of such a watermark would
require noise of high amplitude to be added to every
frequency bin.
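
The concentration of many weak signals into one strong detector
output can be illustrated numerically. The Python sketch below is
a toy model under stated assumptions, not the detector of the
invention: each frequency bin carries the watermark value scaled
by an assumed strength alpha plus unit-variance Gaussian noise
standing in for distortion, and detector_response is a
hypothetical normalised correlation.

```python
import math
import random

def detector_response(received, reference):
    """Correlate the received bins with a reference, normalised by received energy."""
    dot = sum(r * w for r, w in zip(received, reference))
    return dot / math.sqrt(sum(r * r for r in received))

rng = random.Random(2)
n = 4096        # number of frequency bins carrying the watermark (assumed)
alpha = 0.2     # per-bin watermark amplitude relative to the noise (assumed)

watermark = [rng.gauss(0.0, 1.0) for _ in range(n)]
noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
received = [alpha * w + e for w, e in zip(watermark, noise)]

# In any single bin the watermark power is only alpha^2 of the noise power.
per_bin_snr = alpha ** 2

# Knowing the watermark lets the detector add the bins up coherently.
true_response = detector_response(received, watermark)

# Unrelated random sequences produce responses scattered around zero.
decoys = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(20)]
decoy_responses = [detector_response(received, d) for d in decoys]
```

Under this model the response to the true watermark grows roughly
as alpha times the square root of n, whereas the response to an
unrelated sequence stays of order one, so spreading over more bins
permits a weaker signal in each bin.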

In accordance with the teaching of the present
invention, a watermark is inserted into the perceptually
most significant regions of the data decomposition. The
watermark itself is designed to appear to be additive
random noise and is spread throughout the image. By
placing the watermark into the perceptually significant
components, it is much more difficult for an attacker to
add more noise to the components without adversely
affecting the image or other data. It is the fact that the
watermark looks like noise and is spread throughout the
image or data which makes the present scheme appear
to be similar to spread spectrum methods used in
communications systems.

Spreading the watermark throughout the spectrum
of an image ensures a large measure of security
against unintentional or intentional attack. First, the
location of the watermark is not obvious. Second,
frequency regions are selected in a fashion that ensures
severe degradation of the original data following any
attack on the watermark.

A watermark that is well placed in the frequency
domain of an image or a sound track will be practically
impossible to see or hear. This will always be the case if
the energy in the watermark is sufficiently small in any
single frequency coefficient. Moreover, it is possible to
increase the energy present in particular frequencies by
exploiting knowledge of masking phenomena in the
human auditory and visual systems. Perceptual
masking refers to any situation where information in certain
regions of an image or a sound is occluded by
perceptually more prominent information in another part of the
image or sound. In digital waveform coding, this
frequency domain (and in some cases, time/pixel domain)
masking is exploited extensively to achieve low bit rate
encoding of data. It is clear that both auditory and visual
systems attach more resolution to the high energy, low
frequency, spectral regions of an auditory or visual
scene. Further, spectrum analysis of images and
sounds reveals that most of the information in such data
is often located in the low frequency regions.

In addition, particularly for processed or com- 
pressed data, perceptually significant need not refer to 
human perceptual significance, but may refer instead to 
machine perceptual significance, for instance, machine 
feature recognition. 

To meet these requirements, a watermark is
proposed whose structure comprises a large quantity, for
instance 1000, of randomly generated numbers with a
normal distribution having zero mean and unity
variance. A binary watermark is not chosen because it is
much less robust to attacks based on collusion of
several independently watermarked copies of an image.
However, generally, the watermark might have arbitrary
structure, both deterministic and/or random, and
including uniform distributions. The length of the proposed
watermark is variable and can be adjusted to suit the
characteristics of the data. For example, longer
watermarks might be used for images that are especially
sensitive to large modifications of their spectral coefficients,
thus requiring weaker scaling factors for individual
components.
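
A watermark of the structure just described can be generated with
the Python standard library, as in the following sketch. The
function name make_watermark and the use of an integer seed to
make a given owner's watermark reproducible are assumptions made
for illustration; the text only specifies the length (for instance
1000) and the zero-mean, unit-variance normal distribution.

```python
import random
import statistics

def make_watermark(length=1000, seed=None):
    """Draw `length` independent values from a normal distribution N(0, 1)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(length)]

watermark = make_watermark(1000, seed=42)

# Sample statistics should be close to the nominal zero mean and unit variance.
mean = statistics.fmean(watermark)
variance = statistics.pvariance(watermark)

# The same seed regenerates the same watermark, e.g. for later verification.
regenerated = make_watermark(1000, seed=42)
```

The length parameter reflects the trade-off mentioned above: a
longer watermark can be combined with a weaker scaling factor for
each individual component.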

The watermark is then placed in components of the
image spectrum. These components may be chosen
based on an analysis of those components which are
most vulnerable to attack and/or which are most
perceptually significant. This ensures that the watermark
remains with the image even after common signal and
geometric distortions. Modification of these spectral
components results in severe image degradation long
before the watermark itself is destroyed. Of course, to
insert the watermark, it is necessary to alter these very
same coefficients. However, each modification can be
extremely small and, in a manner similar to spread
spectrum communication, a strong narrowband
watermark may be distributed over a much broader image
(channel) spectrum. Conceptually, detection of the
watermark then proceeds by adding all of these very
small signals, whose locations are only known to the
copyright owner, and concentrating the watermark into
a signal with high signal-to-noise ratio. Because the
location of the watermark is only known to the copyright
holder, an attacker would have to add very much more
noise energy to each spectral coefficient in order to be
confident of removing the watermark. However, this
process would destroy the image.

Preferably, a predetermined number of the largest
coefficients of the DCT (discrete cosine transform)
(excluding the DC term) are used. However, the choice
of the DCT is not critical to the algorithm and other
spectral transforms, including wavelet type
decompositions are also possible. In fact, use of the FFT rather
than DCT is preferable from a computational
perspective.
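
Selecting a predetermined number of the largest DCT coefficients
while excluding the DC term can be sketched as follows. The
example is illustrative only: it uses a one-dimensional orthonormal
DCT written out directly rather than a two-dimensional image
transform, a synthetic test signal, and a hypothetical helper
named largest_coefficients.

```python
import math

def dct(signal):
    """Orthonormal DCT-II (O(n^2); adequate for a short illustrative signal)."""
    n = len(signal)
    return [math.sqrt((1.0 if k == 0 else 2.0) / n)
            * sum(x * math.cos(math.pi * (i + 0.5) * k / n) for i, x in enumerate(signal))
            for k in range(n)]

def largest_coefficients(coeffs, count):
    """Indices of the `count` largest-magnitude coefficients, skipping DC (index 0)."""
    ranked = sorted(range(1, len(coeffs)), key=lambda k: abs(coeffs[k]), reverse=True)
    return sorted(ranked[:count])

# Synthetic signal: a constant offset plus components at DCT frequencies 3 and 10.
n = 64
signal = [10.0
          + 4.0 * math.cos(math.pi * (i + 0.5) * 3 / n)
          + 2.0 * math.cos(math.pi * (i + 0.5) * 10 / n)
          for i in range(n)]

coeffs = dct(signal)
chosen = largest_coefficients(coeffs, 2)   # selects frequencies 3 and 10
```

The DC term has the largest magnitude in this signal but is never
selected, since it is excluded before ranking.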

The invention will be more clearly understood when
the following description is read in conjunction with the
accompanying drawing.

Figure 1 is a schematic representation of typical common
processing operations to which data could be subjected;

Figure 2 is a schematic representation of a preferred system
for immersing a watermark into an image;

Figures 3a and 3b are flow charts of the encoding and
decoding of watermarks;

Figure 4 is a graph of the responses of the watermark
detector to random watermarks;

Figure 5 is a graph of the response of the watermark
detector to random watermarks for an image which is
successively watermarked five times;

Figure 6 is a graph of the response of the watermark
detector to random watermarks where five images, each
having a different watermark, are averaged together; and

Figure 7 is a schematic diagram of an optical embodiment
of the present invention.

In order to better understand the advantages of the
invention, the preferred embodiment of a frequency
spectrum based watermarking system will be
described. It is instructive to examine the processing
stages that image (or sound) data may undergo in the
copying process and to consider the effect that such
processing stages can have on the data. Referring to
Figure 1, a watermarked image or sound data 10 is
transmitted 12 to undergo typical distortion or
intentional tampering 14. Such distortions or tampering
includes lossy compression 16, geometric distortion 18,
signal processing 20 and D/A and A/D conversion 22.
After undergoing distortion or tampering, corrupted
watermarked image or sound data 24 is transmitted 26.
The process of "transmission" refers to the application
of any source or channel code and/or of encryption
techniques to the data. While most transmission steps
are information lossless, many compression schemes
(e.g., JPEG, MPEG, etc.) may potentially degrade the
quality of the data through irretrievable loss of data. In
general, a watermarking method should be resilient to
any distortions introduced by transmission or
compression algorithms.

Lossy compression 16 is an operation that usually
eliminates perceptually irrelevant components of image
or sound data. In order to preserve a watermark when
undergoing lossy compression, the watermark is
located in a perceptually significant region of the data.
Most processing of this type occurs in the frequency
domain. Data loss usually occurs in the high frequency
components. Thus, the watermark must be placed in
the significant frequency component of the image (or
sound) data spectrum to minimize the adverse effects of
lossy compression.

After receipt, an image may encounter many
common transformations that are broadly categorized as
geometric distortions or signal distortions. Geometric
distortions 18 are specific to image and video data, and
include such operations as rotation, translation, scaling
and cropping. By manually determining a minimum of
four or nine corresponding points between the original
and the distorted watermark, it is possible to remove
any two or three dimensional affine transformation.
However, an affine scaling (shrinking) of the image
results in a loss of data in the high frequency spectral
regions of the image. Cropping, or the cutting out and
removal of portions of an image, also results in
irretrievable loss of data. Cropping may be a serious threat to
any spatially based watermark but is less likely to affect
a frequency-based scheme.

Common signal distortions include digital-to-analog
and analog-to-digital conversion 22, resampling,
requantization, including dithering and recompression,
and common signal enhancements to image contrast
and/or color, and audio frequency equalization. Many of
these distortions are non-linear, and it is difficult to
analyze their effect in either a spatial or frequency based
method. However, the fact that the original image is
known allows many signal transformations to be
undone, at least approximately. For example, histogram
equalization, a common non-linear contrast
enhancement method, may be substantially removed by
histogram specification or dynamic histogram warping
techniques.

Finally, the copied image may not remain in digital
form. Instead, it is likely to be printed or an analog
recording made (analog audio or video tape). These
reproductions introduce additional degradation into the
image data that a watermarking scheme must be robust
to.

Tampering (or attack) refers to any intentional
attempt to remove the watermark, or corrupt it beyond
recognition. The watermark must not only be resistant
to the inadvertent application of distortions, it must also
be immune to intentional manipulation by malicious
parties. These manipulations can include combinations of
distortions, and can also include collusion and forgery
attacks.

Figure 2 shows a preferred system for inserting a
watermark into an image in the frequency domain.
Image data X(i, j) assumed to be in digital form, or
alternatively data in other formats such as photographs,
paintings or the like, that have been previously digitized
by well-known methods, is subject to a frequency
transformation 30, such as the Fourier transform. A
watermark signal W(k) is inserted into the frequency
spectrum components of the transformed image data
32 applying the techniques described below. The
frequency spectrum image data including the watermark
signal is subjected to an inverse frequency transform
34, resulting in watermarked image data X(i, j), which
may remain in digital form or be printed as an analog
representation by well-known methods.

After applying a frequency transformation to the
image data 30, a perceptual mask is computed that
highlights prominent regions in the frequency spectrum
capable of supporting the watermark without overly
affecting perceptual fidelity. This may be performed by
using knowledge of the perceptual significance of each
frequency in the spectrum, as discussed earlier, or simply
by ranking the frequencies based on their energy.
The latter method was used in experiments described
below.
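
The energy-ranking variant of the perceptual mask can be sketched as
follows. This is a minimal illustration, not the patented implementation:
the function name `energy_mask`, the parameter `k`, the use of NumPy's
`fft2`, and the decision to exclude the DC term are all assumptions made
for the example.

```python
import numpy as np

def energy_mask(image, k):
    """Return a boolean mask marking the k highest-energy FFT coefficients.

    Hypothetical helper: ranks frequencies purely by energy, which is the
    simpler of the two masking options described in the text.
    """
    spectrum = np.fft.fft2(image)
    energy = np.abs(spectrum) ** 2
    energy[0, 0] = 0.0  # assumption: never select the DC component
    top = np.argsort(energy, axis=None)[::-1][:k]  # flat indices, descending
    mask = np.zeros(image.shape, dtype=bool)
    mask.flat[top] = True
    return mask

rng = np.random.default_rng(0)
demo_image = rng.random((16, 16))  # stand-in for real image data
demo_mask = energy_mask(demo_image, k=10)
```

A perceptual model could replace the energy ranking by substituting a
different score array for `energy`.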

In general, it is desired to place the watermark in
regions of the spectrum that are least affected by common
signal distortions and are most significant to image
quality as perceived by a viewer, such that significant
modification would destroy the image fidelity. In practice,
these regions could be experimentally identified by
applying common signal distortions to images and
examining which frequencies are most affected, and by
psychophysical studies to identify how much each component
may be modified before significant changes in
the image are perceivable.

The watermark signal is then inserted into these
prominent regions in a way that makes any tampering
create visible (or audible) defects in the data. The
requirements of the watermark mentioned above and
the distortions common to copying provide constraints
on the design of an electronic watermark.

In order to better understand the watermarking
method, reference is made to Figures 3(a) and 3(b)
where from each document D a sequence of values
$X = x_1, \ldots, x_n$ is extracted 40 with which a watermark
$W = w_1, \ldots, w_n$ is combined 42 to create an adjusted
sequence of values $X' = x'_1, \ldots, x'_n$ which is then inserted
back 44 into the document in place of values $X$ in order
to obtain a watermarked document D'. An attack of the
document D', or other distortion, will produce a document
D*. Having the original document D and the document
D*, a possibly corrupted watermark $W^*$ is
extracted 46 and compared to watermark $W$ 48 for
statistical analysis 50. The values $W^*$ are extracted by first
extracting a set of values $X^* = x^*_1, \ldots, x^*_n$ from D* (using
information about D) and then generating $W^*$ from the
values $X^*$ and the values $X$.

When combining the values $X$ with the watermark
values $W$ in step 42, a scaling parameter $\alpha$ is specified.
The scaling parameter $\alpha$ determines the extent to which
values $W$ alter values $X$. Three preferred formulas for
computing $X'$ are:

$$x'_i = x_i + \alpha w_i \qquad (1)$$

$$x'_i = x_i (1 + \alpha w_i) \qquad (2)$$

$$x'_i = x_i \, e^{\alpha w_i} \qquad (3)$$

Equation 1 is invertible. Equations 2 and 3 are
invertible when $x_i \neq 0$. Therefore, given $X^*$ it is possible to
compute the inverse function necessary to derive $W^*$
from $X$ and $X^*$.

Equation 1 is not the preferred formula when the
values $x_i$ vary over a wide range. For example, if $x_i = 10^6$
then adding 100 may be insufficient to establish a watermark,
but if $x_i = 10$, then adding 100 will unacceptably
distort the value. Insertion methods using equations 2
and 3 are more robust when encountering such a wide
range of values $x_i$. It will also be observed that equations
2 and 3 yield similar results when $\alpha w_i$ is small. Moreover,
when $x_i$ is positive, equation 3 is equivalent to
$\ln(x'_i) = \ln(x_i) + \alpha w_i$ and may be considered as an
application of equation 1 when natural logarithms of the
original values are used. For example, if $|w_i| \le 1$ and $\alpha = 0.01$,
then using equation 2 guarantees that the spectral
coefficient will change by no more than 1%.
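
The three insertion formulas and their inverses can be written out
directly. The sketch below is illustrative only; the function names are
hypothetical, and the inverses assume $x_i \neq 0$ (and positive ratios
for the exponential form), as the text requires.

```python
import numpy as np

# Insertion: equations (1), (2) and (3) of the text.
def insert_additive(x, w, alpha):
    return x + alpha * w

def insert_multiplicative(x, w, alpha):
    return x * (1.0 + alpha * w)

def insert_exponential(x, w, alpha):
    return x * np.exp(alpha * w)

# Extraction: algebraic inverses of the formulas above.
def extract_additive(x, x_star, alpha):
    return (x_star - x) / alpha

def extract_multiplicative(x, x_star, alpha):
    return (x_star / x - 1.0) / alpha       # requires x != 0

def extract_exponential(x, x_star, alpha):
    return np.log(x_star / x) / alpha       # requires x_star / x > 0

x = np.array([10.0, 1.0e3, 1.0e6])   # coefficients spanning a wide range
w = np.array([0.5, -1.0, 1.0])       # watermark values with |w_i| <= 1
alpha = 0.01
x_marked = insert_multiplicative(x, w, alpha)
relative_change = np.abs(x_marked - x) / np.abs(x)  # at most alpha * |w_i|
```

With `alpha = 0.01` and $|w_i| \le 1$, `relative_change` stays at or below
1%, which is the bound stated for equation 2.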

For certain applications, a single scaling parameter
$\alpha$ may not be best for combining all values of $x_i$.
Therefore, multiple scaling parameters $\alpha_1, \ldots, \alpha_n$ can be used
with revised equations 1 to 3 such as $x'_i = x_i (1 + \alpha_i w_i)$.
The values of $\alpha_i$ serve as a relative measure of how
much $x_i$ must be altered to change the perceptual quality
of the document. A large value for $\alpha_i$ means that it is
possible to alter $x_i$ by a large amount without perceptually
degrading the document.

A method for selecting the multiple scaling values is
based upon certain general assumptions. For example,
equation 2 is a special case of the generalized equation
1, $x'_i = x_i + \alpha_i w_i$, for $\alpha_i = \alpha x_i$. That is, equation 2
makes the reasonable assumption that a large value of
$x_i$ is less sensitive to additive alteration than a small
value of $x_i$.

Generally, the sensitivity of the image to different
values of $\alpha_i$ is unknown. A method of empirically
estimating the sensitivities is to determine the distortion
caused by a number of attacks on the original image.
For example, it is possible to compute a degraded
image D* from D, extract the corresponding values
$x^*_1, \ldots, x^*_n$ and select $\alpha_i$ to be proportional to the
deviation $|x^*_i - x_i|$. For greater robustness, it is possible to try
other forms of distortion and make $\alpha_i$ proportional to the
average value of $|x^*_i - x_i|$. Instead of using the average
deviation, it is possible to use the median or maximum
deviation.

Alternatively, it is possible to combine the empirical
approach with general global assumptions regarding
the sensitivity of the values. For example, it might be
required that $\alpha_i \ge \alpha_j$ whenever $x_i \ge x_j$. This can be
combined with the empirical approach by setting $\alpha_i$
according to

$$\alpha_i \sim \max_{\{j \,\mid\, x_j \le x_i\}} |x^*_j - x_j|$$

A more sophisticated approach is to weaken the
monotonicity constraint to be robust against occasional
outliers.
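
The empirical and monotone selection rules for the scaling parameters
can be sketched as below. The helper names, the `reduce` argument, and
the choice of a proportionality constant of 1 are assumptions for the
example; the text fixes the parameters only up to proportionality.

```python
import numpy as np

def empirical_alphas(x, attacked, reduce="mean"):
    """Per-coefficient deviation |x*_i - x_i| aggregated over several attacks.

    `attacked` has shape (num_attacks, n); each row holds the values
    extracted from one degraded copy of the document.
    """
    deviations = np.abs(np.asarray(attacked, dtype=float) - np.asarray(x, dtype=float))
    if reduce == "mean":
        return deviations.mean(axis=0)
    if reduce == "median":
        return np.median(deviations, axis=0)
    if reduce == "max":
        return deviations.max(axis=0)
    raise ValueError("reduce must be 'mean', 'median' or 'max'")

def monotone_alphas(x, deviations):
    """alpha_i = max of the deviations over all j with x_j <= x_i.

    Assumes distinct x values; ties are resolved by a stable sort.
    """
    x = np.asarray(x, dtype=float)
    deviations = np.asarray(deviations, dtype=float)
    order = np.argsort(x, kind="stable")
    running_max = np.maximum.accumulate(deviations[order])
    alphas = np.empty_like(running_max)
    alphas[order] = running_max  # undo the sort
    return alphas

mean_alphas = empirical_alphas([1.0, 2.0], [[1.5, 2.0], [0.5, 3.0]])
mono_alphas = monotone_alphas([1.0, 2.0, 3.0], [0.1, 0.3, 0.2])
```

The running maximum enforces $\alpha_i \ge \alpha_j$ whenever
$x_i \ge x_j$; a more outlier-tolerant variant would replace it with a
smoother statistic.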

The length of the watermark, $n$, determines the
degree to which the watermark is spread among the
relevant components of the image data. As the size of the
watermark increases, so does the number of altered
spectral components, and the extent to which each
component need be altered decreases for the same
resilience to noise. Consider watermarks of the form
$x'_i = x_i + \alpha w_i$ and a white noise attack by $x^*_i = x'_i + r_i$,
where $r_i$ are chosen according to independent normal
distributions with standard deviation $\sigma$. It is possible to
recover the watermark when $\alpha$ is proportional to

$$\frac{\sigma}{\sqrt{n}}$$

That is, quadrupling the number of components can
halve the magnitude of the watermark placed into each
component. The sum of the squares of the deviations
remains essentially unchanged.
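
The trade-off between watermark length and per-component magnitude can
be checked numerically. The simulation below is a sketch under stated
assumptions: the proportionality constant `c = 20`, the coefficient range,
the seeds, and the correlation-based detector response
$W^* \cdot W / \sqrt{W^* \cdot W^*}$ are choices made for this example.

```python
import numpy as np

def alpha_for(n, sigma, c=20.0):
    """Scaling parameter proportional to sigma / sqrt(n); c is an assumed constant."""
    return c * sigma / np.sqrt(n)

def simulate(n, sigma, c=20.0, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.uniform(100.0, 200.0, n)      # hypothetical original values
    w = rng.standard_normal(n)            # watermark drawn from N(0, 1)
    alpha = alpha_for(n, sigma, c)
    x_marked = x + alpha * w              # additive insertion, equation (1)
    x_attacked = x_marked + rng.normal(0.0, sigma, n)   # white noise attack
    w_star = (x_attacked - x) / alpha     # extracted, noisy watermark
    response = np.dot(w_star, w) / np.sqrt(np.dot(w_star, w_star))
    energy = np.sum((x_marked - x) ** 2)  # sum of squared deviations
    return alpha, energy, response

alpha_n, energy_n, response_n = simulate(1000, sigma=1.0)
alpha_4n, energy_4n, response_4n = simulate(4000, sigma=1.0)
```

Here `alpha_4n` is half of `alpha_n`, while the two energies and the two
detector responses stay of comparable size.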

In general, a watermark comprises an arbitrary
sequence of real numbers $W = w_1, \ldots, w_n$. In practice,
each value $w_i$ may be chosen independently from a normal
distribution $N(0, 1)$, where $N(\mu, \sigma^2)$ denotes a normal
distribution with mean $\mu$ and variance $\sigma^2$, or from a
uniform distribution over {1, -1} or {0, 1}.

It is highly unlikely that the extracted mark $W^*$ will
be identical to the original watermark $W$. Even the act of
requantizing the watermarked document for transmission
will cause $W^*$ to deviate from $W$. A preferred measure
of the similarity of $W$ and $W^*$ is

$$\mathrm{sim}(W, W^*) = \frac{W^* \cdot W}{\sqrt{W^* \cdot W^*}}$$

Large values of $\mathrm{sim}(W, W^*)$ are significant in view
of the following analysis. Assume that the authors of
document D* had no access to $W$ (either through the
seller or through a watermarked document). Then for
whatever value of $W^*$ is obtained, the conditional
distribution on $w_i$ will be independently distributed according
to $N(0, 1)$. In this case, $W^* \cdot W$ is distributed according to

$$N\Big(0, \sum_{i=1}^{n} {w^*_i}^2\Big) = N(0, W^* \cdot W^*)$$

Thus, $\mathrm{sim}(W, W^*)$ is distributed according to $N(0, 1)$.
Then, one may apply the standard significance tests for
the normal distribution. For example, if D* is chosen
independently from $W$, then it is very unlikely that
$\mathrm{sim}(W, W^*) > 5$. Note that somewhat higher values of
$\mathrm{sim}(W, W^*)$ may be needed when a large number of
watermarks are on file. The above analysis required only the
independence of $W$ from $W^*$ and did not rely on any
specific properties of $W^*$ itself. This fact provides further
flexibility when preprocessing $W^*$.

The extracted watermark may be processed in
several ways to potentially enhance the ability to detect
a watermark. For example, experiments on images
encountered instances where the average value of $W^*$,
denoted $E_i(W^*)$, differed substantially from 0, due to
the effects of a dithering procedure. While this artifact
could be easily eliminated as part of the extraction process,
it provides a motivation for postprocessing
extracted watermarks. As a result, it was discovered
that the simple transformation $w^*_i \leftarrow w^*_i - E_i(W^*)$ yielded
superior values of $\mathrm{sim}(W, W^*)$. The improved performance
resulted from the decreased value of $W^* \cdot W^*$; the
value of $W^* \cdot W$ was only slightly affected.

In experiments it was frequently observed that $w^*_i$
could be greatly distorted for some values of $i$. One
postprocessing option is to simply ignore such values,
setting them to 0. That is,

$$w^*_i \leftarrow \begin{cases} w^*_i & \text{if } |w^*_i| > \text{tolerance} \\ 0 & \text{otherwise} \end{cases}$$

The goal of such a transformation is to lower $W^* \cdot W^*$.
A less abrupt version of this approach is to normalize
the $W^*$ values to be either -1, 0 or 1, by

$$w^*_i \leftarrow \mathrm{sign}(w^*_i - E_i(W^*))$$

This transformation can have a dramatic effect on
the statistical significance of the result. Other robust
statistical techniques could also be used to suppress
outlier effects.
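
The similarity measure and two of the postprocessing steps can be
sketched as follows. The measure is taken here as
$W^* \cdot W / \sqrt{W^* \cdot W^*}$, the form implied by the
$N(0, W^* \cdot W^*)$ analysis in the text; the function names, the
synthetic offset of 2.0 that stands in for a dithering bias, and the seed
are assumptions for the example.

```python
import numpy as np

def sim(w, w_star):
    """Similarity of watermark w and extracted watermark w_star."""
    return float(np.dot(w_star, w) / np.sqrt(np.dot(w_star, w_star)))

def remove_mean(w_star):
    """Postprocessing: subtract the average value E_i(W*)."""
    return w_star - np.mean(w_star)

def binarize(w_star):
    """Postprocessing: keep only the sign of the mean-removed values."""
    return np.sign(w_star - np.mean(w_star))

rng = np.random.default_rng(3)
n = 1000
w = rng.standard_normal(n)                      # original watermark
w_star = w + 2.0 + rng.standard_normal(n)       # noisy copy with a bias
unrelated = rng.standard_normal(n)              # watermark never inserted

raw_response = sim(w, w_star)
centered_response = sim(w, remove_mean(w_star))
binary_response = sim(w, binarize(w_star))
unrelated_response = sim(unrelated, w_star)
```

Removing the mean lowers $W^* \cdot W^*$ while leaving $W^* \cdot W$
nearly unchanged, so `centered_response` exceeds `raw_response`; the
response for the unrelated watermark stays near zero.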

In principle, any frequency domain transform can
be used. In the scheme described below, a Fourier
domain method is used, but wavelet based
schemes are also useable as a variation. In terms of
selecting frequency regions of the transform, it is possible
to use models for the perceptual system under
consideration.

Frequency analysis may be performed by a wavelet
or sub-band transform where the signal is divided into
sub-bands by means of a wavelet or multi-resolution
transform. The sub-bands need not be uniformly
spaced. Each sub-band may be thought of as representing
a frequency region in the domain corresponding
to a sub-region of the frequency range of the signal. The
watermark is then inserted into the sub-regions.

For audio data, a sliding "window" moves along the
signal data and the frequency transform (DCT, FFT,
etc.) is taken of the sample in the window. This process
enables the capture of meaningful information of a signal
that is time varying in nature.
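
A sliding-window transform of this kind can be sketched with NumPy's
real FFT. The function name `windowed_spectra`, the window size, and the
hop length are illustrative assumptions; the text does not specify them,
and no window tapering is applied here.

```python
import numpy as np

def windowed_spectra(signal, window_size, hop):
    """Take the FFT of each window as it slides along a 1-D signal."""
    signal = np.asarray(signal, dtype=float)
    spectra = []
    for start in range(0, len(signal) - window_size + 1, hop):
        frame = signal[start:start + window_size]
        spectra.append(np.fft.rfft(frame))   # spectrum of this window only
    return np.array(spectra)

# A pure tone that completes 8 cycles per 256-sample window.
t = np.arange(1024)
tone = np.sin(2.0 * np.pi * 8.0 * t / 256.0)
spectra = windowed_spectra(tone, window_size=256, hop=128)
peak_bins = np.argmax(np.abs(spectra), axis=1)
```

Each row of `spectra` could then be watermarked independently, which is
one way to respect the time-varying nature of the signal.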

Each coefficient in the frequency domain is
assumed to have a perceptual capacity. That is, it can
support the insertion of additional information without
any (or with minimal) impact to the perceptual fidelity of
the data.

In order to place a length L watermark into an N x N
image, the N x N FFT (or DCT) of the image is computed
and the watermark is placed into the L highest
magnitude coefficients of the transform matrix, excluding
the DC component. More generally, L randomly chosen
coefficients could be chosen from the M most
perceptually significant coefficients of the transform. For
most images, these coefficients will be the ones
corresponding to the low frequencies. The purpose of placing
the watermark in these locations is that significant
tampering with these frequencies will destroy the image
fidelity or perceived quality well before the watermark is
destroyed.
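
The insertion of a length L watermark into the highest-magnitude
transform coefficients, and its extraction given the original image, can
be sketched as below. Several choices are assumptions made for this
example: NumPy's real FFT (`rfft2`/`irfft2`) stands in for the transform,
equation 2 is used for insertion with `alpha = 0.1`, and the
self-conjugate columns of the half-spectrum (which include the DC term)
are skipped so that every modified coefficient can be changed freely.

```python
import numpy as np

def embed(image, w, alpha=0.1):
    """Insert watermark w into the len(w) largest-magnitude FFT coefficients."""
    spectrum = np.fft.rfft2(image)
    magnitude = np.abs(spectrum)
    magnitude[:, 0] = 0.0                 # skip column 0, including DC
    if image.shape[1] % 2 == 0:
        magnitude[:, -1] = 0.0            # skip the Nyquist column
    idx = np.argsort(magnitude, axis=None)[::-1][:len(w)]
    flat = spectrum.ravel().copy()
    flat[idx] = flat[idx] * (1.0 + alpha * w)        # equation (2)
    marked = np.fft.irfft2(flat.reshape(spectrum.shape), s=image.shape)
    return marked, idx

def extract(original, suspect, idx, alpha=0.1):
    """Recover the (possibly corrupted) watermark using the original image."""
    x = np.fft.rfft2(original).ravel()[idx]
    x_star = np.fft.rfft2(suspect).ravel()[idx]
    return np.real(x_star / x - 1.0) / alpha         # inverse of equation (2)

rng = np.random.default_rng(0)
image = rng.random((32, 32))              # stand-in for an N x N image
w = rng.standard_normal(50)               # length L = 50 watermark
marked, idx = embed(image, w)
recovered = extract(image, marked, idx)
```

Without any attack, `recovered` matches `w` up to floating-point error;
after an attack it would be compared with `w` by a similarity measure.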

The FFT provides perceptually similar results to the
DCT. This is different than the case of transform coding,
where the DCT is preferred to the FFT due to its spectral
properties. The DCT tends to have less high frequency
information than the FFT and places most
of the image information in the low frequency regions,
making it preferable in situations where data need to be
eliminated. In the case of watermarking, image data is
preserved, and nothing is eliminated. Thus the FFT is
as good as the DCT, and is preferred since it is easier to
compute.

In an experiment, a visually imperceptible watermark
was intentionally placed in an image. Subsequently,
100 randomly generated watermarks, only one
of which corresponded to the correct watermark, were
applied to the watermark detector described above. The
result, as shown in Figure 4, was a very strong positive
response corresponding to the correct watermark,
suggesting that the method results in a very low number of
false positive responses and a very low false negative
response rate.
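
The shape of this detector experiment can be reproduced on synthetic
data. The sketch below does not use the image from the experiment: the
extracted watermark is simulated as the true watermark plus unit-variance
noise, and the watermark length, seed, and position of the true watermark
are assumptions. The threshold of 5 is the significance level mentioned
earlier in the text.

```python
import numpy as np

def sim(w, w_star):
    """Detector response W*.W / sqrt(W*.W*)."""
    return float(np.dot(w_star, w) / np.sqrt(np.dot(w_star, w_star)))

rng = np.random.default_rng(1)
n = 1000
candidates = rng.standard_normal((100, n))   # 100 random watermarks
true_index = 42                              # the one actually inserted
w_star = candidates[true_index] + rng.standard_normal(n)  # simulated extraction

responses = np.array([sim(w, w_star) for w in candidates])
detected = int(np.argmax(responses))
```

Only the entry at `true_index` produces a response far above the
threshold; the other 99 responses behave like samples from $N(0, 1)$.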

In another test, the watermarked image was scaled
to half of its original size. In order to recover the
watermark, the image was re-scaled to its original size, albeit
with loss of detail due to subsampling of the image
using low pass spatial filter operations. The response of
the watermark detector was well above random chance
levels, suggesting that the watermark is robust to
geometric distortions. This result was achieved even though
75 percent of the original data was missing from the
scaled down image.

In a further experiment, a JPEG encoded version of
the image with parameters of 10 percent quality and 0
percent smoothing, resulting in visible distortions, was
used. The results of the watermark detector suggest
that the method is robust to common encoding
distortions. Even using a version of the image with parameters
of 5 percent quality and 0 percent smoothing,
the results were well above that achievable due to
random chance.

In experiments using a dithered version of the
image, the response of the watermark detector
suggested that the method is robust to common encoding
distortion. Moreover, more reliable detection is achieved
by removing any non-zero mean from the extracted
watermark.

In another experiment the image was clipped, leaving
only the central quarter of the image. In order to
extract the watermark from the clipped image, the missing
portion of the image was replaced with portions from
the original unwatermarked image. The watermark
detector was able to recover the watermark with a
response greater than random. When the non-zero
mean was removed, and the elements of the watermark
were binarized prior to the comparison with the correct
watermark, the detector response was improved. This
result is achieved even though 75 percent of the data
was removed from the image.

In yet another experiment the image was printed,
photocopied, scanned using a 300 dpi Umax PS-2400x
scanner and rescaled to a size of 256 x 256 pixels.
Clearly, the final image suffered from different levels of
distortion introduced at each process. High frequency
pattern noise was particularly noticeable. When the
non-zero mean was removed and only the sign of the
elements of the watermark was used, the watermark
detector response improved to well above random
chance levels.

In still another experiment, the image was subject
to five successive watermarking operations. That is, the
original image was watermarked, the watermarked
image was watermarked, and so forth. The process
may be considered another form of attack in which it is
clear that significant image degradation occurs if the
process is repeated. Figure 5 shows the response of the
watermark detector to 1000 randomly generated
watermarks, including the five watermarks present in the
image. The five dominant spikes in the graph, indicative
of the presence of the five watermarks, show that
successive watermarking does not interfere with the
process.

The fact that successive watermarking is possible
means that the history or pedigree of a document is
determinable if successive watermarking is added with
each copy.

In a variation of the multiple watermark image, five
separately watermarked images were averaged
together to simulate a simple collusion attack. Figure 6
shows the response of the watermark detector to 1000
randomly generated watermarks, including the five
watermarks present in the original images. The result is
that simple collusion based on averaging is ineffective in
defeating the present watermarking system.
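
The averaging form of collusion can be simulated on a sequence of
coefficient values rather than a full image. Everything specific in this
sketch is an assumption: the coefficient range, `alpha = 0.1`, insertion
by equation 2, the seed, and the detector response
$W^* \cdot W / \sqrt{W^* \cdot W^*}$ with a threshold of 5.

```python
import numpy as np

def sim(w, w_star):
    """Detector response W*.W / sqrt(W*.W*)."""
    return float(np.dot(w_star, w) / np.sqrt(np.dot(w_star, w_star)))

rng = np.random.default_rng(2)
n = 1000
alpha = 0.1
x = rng.uniform(50.0, 100.0, n)            # hypothetical original coefficients
marks = rng.standard_normal((5, n))        # five distinct watermarks
copies = x * (1.0 + alpha * marks)         # five watermarked copies, equation (2)

colluded = copies.mean(axis=0)             # attackers average their copies
w_star = (colluded / x - 1.0) / alpha      # extraction using the original

colluder_responses = np.array([sim(w, w_star) for w in marks])
decoys = rng.standard_normal((995, n))     # watermarks never inserted
decoy_responses = np.array([sim(w, w_star) for w in decoys])
```

Averaging attenuates each watermark by a factor of five, yet each of the
five still gives a response of roughly $\sqrt{n/5}$, well clear of the
decoys.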

The result of the above experiments is that the
described system can extract a reliable copy of the
watermark from images that have been significantly
degraded through several common geometric and signal
processing procedures. These procedures include
zooming (low pass filtering), cropping, lossy JPEG
encoding, dithering, printing, photocopying and
subsequent rescanning.

While these experiments were, in fact, conducted
using an image, similar results are attainable with text
images, audio data and video data, although attention
must be paid to the time varying nature of these data.

The above implementation of the watermarking
system is an electronic system. Since the basic principle
of the invention is the inclusion of a watermark into
spectral frequency components of the data, watermarking
can be accomplished by other means using, for
example, an optical system as shown in Figure 7.

In Figure 7, data to be watermarked such as an
image 40 is passed through a spatial transform lens 42,
such as a Fourier transform lens, the output of which
lens is the spatial transform of the image. Concurrently,
a watermark image 44 is passed through a second spatial
transform lens 46, the output of which lens is the
spatial transform of the watermark image 44. The spatial
transform from lens 42 and the spatial transform from
lens 46 are combined at an optical combiner 48. The
output of the optical combiner 48 is passed through an
inverse spatial transform lens 50 from which the
watermarked image 52 is present. The result is a unique,
virtually imperceptible, watermarked image. Similar results
are achievable by transmitting video or multimedia
signals through the lenses in the manner described above.

Claims 

1. A method of inserting a watermark into data
comprising the steps of:

obtaining a decomposition of data to be
watermarked;

inserting a watermark into the perceptually
significant components of the decomposition of
data; and

applying an inverse transform to the
decomposition of data with the watermark for generating
watermarked data.

2. A method of inserting a watermark into data as set
forth in claim 1, said obtaining a decomposition of
data being obtaining a spectral decomposition of
data.

3. A method as set forth in claim 1 or 2, where said
data comprises image data, video data, audio data
and/or multimedia data.

4. A method as set forth in claim 2 or 3, where said
obtaining a spectral decomposition of data is
selected from the group consisting of Fourier
transformation, discrete cosine transformation,
Hadamard transformation, and wavelet, multi-resolution,
sub-band method.

5. A method as set forth in any one of claims 1 to 4,
where said inserting a watermark inserts watermark
values so that addition of additional signal into
a perceptually significant component affects the
perceived quality of the data.

6. A method as set forth in any one of claims 2 to 5,
further comprising:

comparing data with watermarked data for
obtaining extracted data values;

comparing extracted data values with watermark
values and data for obtaining difference
values; and

analyzing difference values to determine the
watermark in the watermarked data.

7. A method as set forth in claim 6, where the
watermark values are chosen according to a normal
distribution.

8. A method of inserting a watermark into data
comprising the steps of:

extracting values of perceptually significant
components of a spectral decomposition of
data;

combining watermark values with the extracted
values to create adjusted values; and

inserting the adjusted values into the data in
place of the extracted values to produce
watermarked data.

9. A method of inserting a watermark into data as set
forth in claim 8, where the watermark values are
chosen according to a random distribution.

10. The method as set forth in any one of claims 6 to 9,
where watermark values include associated scaling
parameters.

11. A method as set forth in claim 10, where scaling
parameters are selected such that adding additional
watermark value affects the perceived quality
of the data.

12. A method as set forth in any one of claims 8 to 11,
further comprising:

comparing data with watermarked data for
obtaining extracted data values;

comparing extracted data values with
watermarked values and data for obtaining difference
values; and

analyzing difference values to determine the
watermark in the watermarked data.

13. A method of inserting a watermark into data as set
forth in claim 12, further comprising the step of
preprocessing distorted or tampered watermarked
data before said comparing data.

14. A method of inserting a watermark into data as set
forth in claim 13, where said distorted or tampered
watermarked data is clipped data and said
preprocessing comprises replacing missing portions of the
data with corresponding portions from original
unwatermarked data.

15. A system for inserting a watermark into data
comprising:

providing image data;

providing watermark image data;

first transform lens for transforming image data
passing therethrough into transformed image
data;

second transform lens for transforming watermark
image data passing therethrough into
transformed watermark image data;

optical combiner for combining the transformed
image data and the transformed watermark
image data to form transformed watermarked
data; and

inverse transform lens for forming watermarked
data by inverse transformation of transformed
watermarked data.

16. A system for inserting a watermark into data as set
forth in claim 15, where said first transform lens and
said second transform lens are Fourier transform
lenses and said inverse transform lens is an inverse
Fourier transform lens.

17. A method of inserting a watermark into data
comprising the steps of:

obtaining a decomposition of data to be
watermarked;

modifying the data to be watermarked by
subjecting the data to distortion and/or tampering;

obtaining a decomposition of the modified data;

comparing the components of the decomposition
of data to be watermarked with the components
of the decomposition of the modified
data; and

inserting a watermark into the data to be
watermarked based upon said comparing.